Grayson County
MedVision: Dataset and Benchmark for Quantitative Medical Image Analysis
Yao, Yongcheng, Zong, Yongshuo, Dutt, Raman, Yang, Yongxin, Tsaftaris, Sotirios A, Hospedales, Timothy
Current vision-language models (VLMs) in medicine are primarily designed for categorical question answering (e.g., "Is this normal or abnormal?") or qualitative descriptive tasks. However, clinical decision-making often relies on quantitative assessments, such as measuring the size of a tumor or the angle of a joint, from which physicians draw their own diagnostic conclusions. This quantitative reasoning capability remains underexplored and poorly supported in existing VLMs. In this work, we introduce MedVision, a large-scale dataset and benchmark specifically designed to evaluate and improve VLMs on quantitative medical image analysis. MedVision spans 22 public datasets covering diverse anatomies and modalities, with 30.8 million image-annotation pairs. We focus on three representative quantitative tasks: (1) detection of anatomical structures and abnormalities, (2) tumor/lesion (T/L) size estimation, and (3) angle/distance (A/D) measurement. Our benchmarks show that current off-the-shelf VLMs perform poorly on these tasks. However, with supervised fine-tuning on MedVision, we significantly enhance their performance across detection, T/L estimation, and A/D measurement, demonstrating reduced error rates and improved precision. This work provides a foundation for developing VLMs with robust quantitative reasoning capabilities in medical imaging. Code and data are available at https://medvision-vlm.github.io.
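To make the angle/distance (A/D) measurement task concrete, the following is a minimal sketch of how such quantities can be computed from landmark coordinates. It is not taken from the MedVision code base: the function names, the `spacing` parameter (physical size of one pixel along each axis), and the relative-error metric are assumptions made for illustration only.

```python
import math

def distance_mm(p, q, spacing=(1.0, 1.0)):
    """Physical distance between two pixel landmarks p and q.

    `spacing` is a hypothetical (x, y) pixel size in millimetres.
    """
    dx = (q[0] - p[0]) * spacing[0]
    dy = (q[1] - p[1]) * spacing[1]
    return math.hypot(dx, dy)

def angle_deg(a, vertex, b, spacing=(1.0, 1.0)):
    """Angle in degrees at `vertex` between the rays vertex->a and vertex->b."""
    v1 = ((a[0] - vertex[0]) * spacing[0], (a[1] - vertex[1]) * spacing[1])
    v2 = ((b[0] - vertex[0]) * spacing[0], (b[1] - vertex[1]) * spacing[1])
    dot = v1[0] * v2[0] + v1[1] * v2[1]
    cross = v1[0] * v2[1] - v1[1] * v2[0]
    # atan2 of |cross| and dot stays numerically stable near 0 and 180 degrees.
    return math.degrees(math.atan2(abs(cross), dot))

def relative_error(predicted, target):
    """Absolute error normalised by the target value (one possible metric)."""
    return abs(predicted - target) / abs(target)
```

For example, a model's predicted distance could be scored with `relative_error(prediction, distance_mm(p, q, spacing))`; scaling by pixel spacing before measuring matters whenever the pixels are not square.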
- Europe > Slovenia > Drava > Municipality of Benedikt > Benedikt (0.04)
- North America > United States > Texas > Grayson County (0.04)
- Europe > Switzerland (0.04)
- Information Technology > Sensing and Signal Processing > Image Processing (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.94)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.93)
- Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.66)
- North America > United States > Texas > Grayson County (0.04)
- Asia > Japan > Kyūshū & Okinawa > Okinawa (0.04)
- Research Report > New Finding (1.00)
- Research Report > Experimental Study (1.00)
- Workflow (0.67)
- Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
- Information Technology > Artificial Intelligence > Natural Language (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
- Information Technology > Artificial Intelligence > Cognitive Science (1.00)
Unsupervised Representation Learning from Sparse Transformation Analysis
Song, Yue, Keller, Thomas Anderson, Yue, Yisong, Perona, Pietro, Welling, Max
There is a vast literature on representation learning based on principles such as coding efficiency, statistical independence, causality, controllability, or symmetry. In this paper we propose to learn representations from sequence data by factorizing the transformations of the latent variables into sparse components. Input data are first encoded as distributions of latent activations and subsequently transformed using a probability flow model, before being decoded to predict a future input state. The flow model is decomposed into a number of rotational (divergence-free) vector fields and a number of potential flow (curl-free) fields. Our sparsity prior encourages only a small number of these fields to be active at any instant and infers the speed with which the probability flows along these fields. Training this model is completely unsupervised using a standard variational objective and results in a new form of disentangled representations where the input is not only represented by a combination of independent factors, but also by a combination of independent transformation primitives given by the learned flow fields. When viewing the transformations as symmetries one may interpret this as learning approximately equivariant representations. Empirically we demonstrate that this model achieves state of the art in terms of both data likelihood and unsupervised approximate equivariance errors on datasets composed of sequence transformations.
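The distinction between rotational (divergence-free) and potential-flow (curl-free) fields can be checked numerically on a toy example. The sketch below is not the paper's model: it uses two hand-picked 2D fields and central finite differences, and all function names are illustrative assumptions.

```python
def rotational_field(x, y):
    """Rigid rotation about the origin: divergence-free, curl = 2."""
    return (-y, x)

def potential_field(x, y):
    """Gradient of phi(x, y) = 0.5 * (x**2 + y**2): curl-free, divergence = 2."""
    return (x, y)

def divergence(field, x, y, h=1e-5):
    """Central-difference estimate of du/dx + dv/dy at (x, y)."""
    du_dx = (field(x + h, y)[0] - field(x - h, y)[0]) / (2 * h)
    dv_dy = (field(x, y + h)[1] - field(x, y - h)[1]) / (2 * h)
    return du_dx + dv_dy

def curl(field, x, y, h=1e-5):
    """Central-difference estimate of the scalar 2D curl dv/dx - du/dy."""
    dv_dx = (field(x + h, y)[1] - field(x - h, y)[1]) / (2 * h)
    du_dy = (field(x, y + h)[0] - field(x, y - h)[0]) / (2 * h)
    return dv_dx - du_dy
```

Evaluating `divergence(rotational_field, x, y)` and `curl(potential_field, x, y)` at any point gives values close to zero, which is the property that separates the two families of fields described in the abstract.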
- Europe > Netherlands > North Holland > Amsterdam (0.04)
- North America > United States > California (0.04)
- Europe > United Kingdom > North Sea > Southern North Sea (0.04)
- Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
- Information Technology > Artificial Intelligence > Vision (0.93)
- Information Technology > Artificial Intelligence > Natural Language (0.68)
Closed-Loop Magnetic Control of Medical Soft Continuum Robots for Deflection
Magnetic soft continuum robots (MSCRs) have emerged as powerful devices in endovascular interventions owing to their hyperelastic fibre matrix and enhanced magnetic manipulability. Effective closed-loop control of tethered magnetic devices is a step toward autonomous vascular robotic surgery. In this article, we employ a magnetic actuation system equipped with a single rotatable permanent magnet to achieve closed-loop deflection control of the MSCR. To this end, we establish a differential kinematic model of MSCRs exposed to non-uniform magnetic fields. We deduce how the existence and uniqueness of the Jacobian depend on the relative geometric position of the robots, and simulations show that the control direction induced by the Jacobian is crucial. The corresponding quasi-static control (QSC) framework then integrates a linear extended state observer to estimate model uncertainties. Finally, the effectiveness of the proposed QSC framework is validated through trajectory tracking experiments that compare it with a PD controller under external disturbances. The Jacobian is further extended to path-following control of the distal end position. The proposed control framework prevents the actuator from reaching the joint limit and achieves fast, low-error tracking without overshoot.
- Asia > China > Beijing > Beijing (0.05)
- Asia > Singapore (0.04)
- North America > United States > Texas > Grayson County (0.04)
Algebraic Machine Learning with an Application to Chemistry
Sai, Ezzeddine El, Gara, Parker, Pflaum, Markus J.
As datasets used in scientific applications become more complex, studying the geometry and topology of data has become an increasingly prevalent part of the data analysis process. This can be seen, for example, in the growing interest in topological tools such as persistent homology. However, topological tools are inherently limited to providing only coarse information about the underlying space of the data. More geometric approaches, on the other hand, rely predominantly on the manifold hypothesis, which asserts that the underlying space is a smooth manifold. This assumption fails for many physical models where the underlying space contains singularities. In this paper we develop a machine learning pipeline that captures fine-grained geometric information without relying on any smoothness assumptions. Our approach works within the scope of algebraic geometry and algebraic varieties instead of differential geometry and smooth manifolds. In the setting of the variety hypothesis, the learning problem becomes that of finding the underlying variety from sample data. We cast this learning problem as a Maximum A Posteriori optimization problem, which we solve in terms of an eigenvalue computation. Having found the underlying variety, we explore the use of Gröbner bases and numerical methods to reveal information about its geometry. In particular, we propose a heuristic for numerically detecting points lying near the singular locus of the underlying variety.
- North America > United States > Colorado > Boulder County > Boulder (0.14)
- North America > United States > New York (0.04)
- North America > United States > Louisiana (0.04)
Revisiting Skin Tone Fairness in Dermatological Lesion Classification
Kalb, Thorsten, Kushibar, Kaisar, Cintas, Celia, Lekadir, Karim, Diaz, Oliver, Osuala, Richard
Addressing fairness in lesion classification from dermatological images is crucial due to variations in how skin diseases manifest across skin tones. However, the absence of skin tone labels in public datasets hinders building a fair classifier. To date, such skin tone labels have been estimated prior to fairness analysis in independent studies using the Individual Typology Angle (ITA). Briefly, ITA calculates an angle based on pixels extracted from skin images, taking into account the lightness and yellow-blue tints. These angles are then categorised into skin tones that are subsequently used to analyse fairness in skin cancer classification. In this work, we review and compare four ITA-based approaches to skin tone classification on the ISIC18 dataset, a common benchmark for assessing skin cancer classification fairness in the literature. Our analyses reveal a high level of disagreement among previously published studies, demonstrating the risks of ITA-based skin tone estimation methods. Moreover, we investigate the causes of the large discrepancies among these approaches and find that the lack of diversity in the ISIC18 dataset limits its use as a testbed for fairness analysis. Finally, we recommend further research on robust ITA estimation and diverse dataset acquisition with skin tone annotation to facilitate conclusive fairness assessments of artificial intelligence tools in dermatology.
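The ITA computation mentioned above can be sketched as follows, using the commonly cited definition ITA = arctan((L* − 50) / b*) × 180/π on CIELAB lightness L* and yellow-blue component b*. This is an illustration, not one of the four approaches compared in the paper: the category thresholds below are one widely used convention and are an assumption here, and published studies differ on exactly these choices.

```python
import math

def ita_degrees(lightness, b_star):
    """Individual Typology Angle in degrees from CIELAB L* and b*.

    atan2 is used so that b* == 0 does not raise; for b* > 0 it matches
    arctan((L* - 50) / b*).
    """
    return math.degrees(math.atan2(lightness - 50.0, b_star))

# Assumed lower bounds in degrees for each category, checked from lightest to darkest.
ITA_THRESHOLDS = [
    (55.0, "very light"),
    (41.0, "light"),
    (28.0, "intermediate"),
    (10.0, "tan"),
    (-30.0, "brown"),
]

def skin_tone_category(ita):
    """Map an ITA value to a category under the assumed thresholds."""
    for lower_bound, label in ITA_THRESHOLDS:
        if ita > lower_bound:
            return label
    return "dark"
```

In practice L* and b* would be aggregated over selected skin pixels of an image before calling `ita_degrees`; how those pixels are selected and aggregated is one place where ITA-based pipelines can diverge.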
- Europe > Switzerland (0.04)
- South America (0.04)
- Oceania > Australia (0.04)
- Research Report (0.64)
- Overview (0.48)
- Health & Medicine > Therapeutic Area > Dermatology (1.00)
- Health & Medicine > Therapeutic Area > Oncology > Skin Cancer (0.57)
Federated Model Aggregation via Self-Supervised Priors for Highly Imbalanced Medical Image Classification
Elbatel, Marawan, Wang, Hualiang, Martí, Robert, Fu, Huazhu, Li, Xiaomeng
In the medical field, federated learning commonly deals with highly imbalanced datasets, including skin lesions and gastrointestinal images. Existing federated methods for highly imbalanced datasets primarily focus on optimizing a global model without incorporating the intra-class variations that can arise in medical imaging due to different populations, findings, and scanners. In this paper, we study the inter-client intra-class variations with publicly available self-supervised auxiliary networks. Specifically, we find that employing a shared auxiliary pre-trained model, such as MoCo-V2, locally on every client yields consistent divergence measurements. Based on these findings, we derive a dynamic balanced model aggregation via self-supervised priors (MAS) to guide the global model optimization. The resulting method, Fed-MAS, can be combined with different local learning methods for effective model aggregation toward a highly robust and unbiased global model.
- Europe > Switzerland (0.05)
- Asia > China > Hong Kong (0.05)
- Asia > Singapore (0.04)
- North America > United States > Texas > Grayson County (0.04)
- Research Report (0.82)
- Instructional Material > Course Syllabus & Notes (0.51)
- Instructional Material > Online (0.41)
- Health & Medicine > Therapeutic Area (1.00)
- Health & Medicine > Diagnostic Medicine > Imaging (1.00)
Omni-swarm: A Decentralized Omnidirectional Visual-Inertial-UWB State Estimation System for Aerial Swarms
Xu, Hao, Zhang, Yichen, Zhou, Boyu, Wang, Luqi, Yao, Xinjie, Meng, Guotao, Shen, Shaojie
Decentralized state estimation is one of the most fundamental components of autonomous aerial swarm systems in GPS-denied areas, yet it remains a highly challenging research topic. This paper proposes Omni-swarm, a decentralized omnidirectional visual-inertial-UWB state estimation system for aerial swarms, to address this challenge. To solve the issues of observability, complicated initialization, insufficient accuracy, and lack of global consistency, we introduce an omnidirectional perception front-end in Omni-swarm. It consists of stereo wide-FoV cameras and ultra-wideband sensors, visual-inertial odometry, multi-drone map-based localization, and visual drone tracking algorithms. The measurements from the front-end are fused with graph-based optimization in the back-end. Experimental results show that the proposed method achieves centimeter-level relative state estimation accuracy while guaranteeing global consistency in the aerial swarm. Moreover, supported by Omni-swarm, inter-drone collision avoidance can be accomplished without any external devices, demonstrating the potential of Omni-swarm as the foundation of autonomous aerial swarms.
- North America > United States > Pennsylvania > Philadelphia County > Philadelphia (0.14)
- Asia > China > Hong Kong (0.05)
- Asia > Middle East > Republic of Türkiye > Batman Province > Batman (0.04)
Online Learning for Non-monotone Submodular Maximization: From Full Information to Bandit Feedback
Zhang, Qixin, Deng, Zengde, Chen, Zaiyi, Zhou, Kuangqi, Hu, Haoyuan, Yang, Yu
In this paper, we revisit the online non-monotone continuous DR-submodular maximization problem over a down-closed convex set, which has wide real-world applications in machine learning, economics, and operations research. First, we present the Meta-MFW algorithm, which achieves a $1/e$-regret of $O(\sqrt{T})$ at the cost of $T^{3/2}$ stochastic gradient evaluations per round. As far as we know, Meta-MFW is the first algorithm to obtain a $1/e$-regret of $O(\sqrt{T})$ for the online non-monotone continuous DR-submodular maximization problem over a down-closed convex set. Furthermore, in sharp contrast with the ODC algorithm \citep{thang2021online}, Meta-MFW relies on a simple online linear oracle without discretization, lifting, or rounding operations. Considering practical restrictions, we then propose the Mono-MFW algorithm, which reduces the number of stochastic gradient evaluations per function from $T^{3/2}$ to 1 and achieves a $1/e$-regret bound of $O(T^{4/5})$. Next, we extend Mono-MFW to the bandit setting and propose the Bandit-MFW algorithm, which attains a $1/e$-regret bound of $O(T^{8/9})$. To the best of our knowledge, Mono-MFW and Bandit-MFW are the first sublinear-regret algorithms for the one-shot and bandit settings, respectively, of the online non-monotone continuous DR-submodular maximization problem over a down-closed convex set. Finally, we conduct numerical experiments on both synthetic and real-world datasets to verify the effectiveness of our methods.
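For readers unfamiliar with the term, the $1/e$-regret used above is an instance of the standard $\alpha$-regret from online optimization. The following is a sketch of the usual definition, not a formula quoted from the paper: the symbols $\mathcal{K}$ (the down-closed convex feasible set), $f_t$ (the reward function revealed in round $t$), and $x_t$ (the point played in round $t$) are notation assumed here.

```latex
\mathcal{R}_{1/e}(T)
  \;=\; \frac{1}{e}\,\max_{x \in \mathcal{K}} \sum_{t=1}^{T} f_t(x)
  \;-\; \sum_{t=1}^{T} f_t(x_t)
```

Under this definition, a bound such as $O(\sqrt{T})$ means that the average per-round gap to a $1/e$ fraction of the best fixed decision in hindsight vanishes as $T$ grows; when the algorithm is randomized, the second sum is typically taken in expectation.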
- Asia > Singapore (0.04)
- Asia > China > Hong Kong > Kowloon (0.04)
- North America > United States > Texas > Grayson County (0.04)